5,919,665



Arg Val Tyr Ile Asn Val Val Val Lys Asn Lys Glu Tyr Arg Leu Ala
325 330 335

ACC AAT GCT TCT CAG GCT GGT GTA GAA AAG ATC TTG TCT GCT CTG GAA 1056
Thr Asn Ala Ser Gln Ala Gly Val Glu Lys Ile Leu Ser Ala Leu Glu
340 345 350

ATC CCG GAC GTT GGT AAT CTG TCT CAG GTA GTT GTA ATG AAA TCC AAG 1104
Ile Pro Asp Val Gly Asn Leu Ser Gln Val Val Val Met Lys Ser Lys
355 360 365

AAC GAC CAG GGT ATC ACT AAC AAA TGC AAA ATG AAT CTG CAG GAC AAC 1152
Asn Asp Gln Gly Ile Thr Asn Lys Cys Lys Met Asn Leu Gln Asp Asn
370 375 380

AAT GGT AAC GAT ATC GGT TTC ATC GGT TTC CAC CAG TTC AAC AAT ATC 1200
Asn Gly Asn Asp Ile Gly Phe Ile Gly Phe His Gln Phe Asn Asn Ile
385 390 395 400

GCT AAA CTG GTT GCT TCC AAC TGG TAC AAT CGT CAG ATC GAA CGT TCC 1248
Ala Lys Leu Val Ala Ser Asn Trp Tyr Asn Arg Gln Ile Glu Arg Ser
405 410 415

TCT CGC ACT CTG GGT TGC TCT TGG GAG TTC ATC CCG GTT GAT GAC GGT 1296
Ser Arg Thr Leu Gly Cys Ser Trp Glu Phe Ile Pro Val Asp Asp Gly
420 425 430

TGG GGT GAA CGT CCG CTG TAACCCGGGA AAGCTT 1330
Trp Gly Glu Arg Pro Leu
435
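
The final line of the listing above pairs six codons with six residues and then ends at a TAA stop codon. The pairing can be sketched in Python as follows, assuming the standard genetic code; the table is deliberately trimmed to the codons on that line, and the helper name `translate_codons` is hypothetical, not something taken from the patent.

```python
# Partial standard-genetic-code table: only the codons on the final line above.
CODON_TO_RESIDUE = {
    "TGG": "Trp",
    "GGT": "Gly",
    "GAA": "Glu",
    "CGT": "Arg",
    "CCG": "Pro",
    "CTG": "Leu",
    "TAA": None,  # stop codon
}


def translate_codons(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TO_RESIDUE[dna[i:i + 3]]
        if residue is None:
            break
        residues.append(residue)
    return residues


# Coding portion of the last line, followed by the TAA stop codon.
last_line = "TGGGGTGAACGTCCGCTGTAA"
print(" ".join(translate_codons(last_line)))  # prints: Trp Gly Glu Arg Pro Leu
```

The running nucleotide count agrees with this reading: 438 codons account for 1314 nucleotides, and the 16 nucleotides from the stop codon onward (TAACCCGGGA AAGCTT) bring the total to the 1330 shown.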



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 438 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

Met Ala Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Asn Ile Ile
1 5 10 15

Asn Thr Ser Ile Leu Asn Leu Arg Tyr Glu Ser Asn His Leu Ile Asp
20 25 30

Leu Ser Arg Tyr Ala Ser Lys Ile Asn Ile Gly Ser Lys Val Asn Phe
35 40 45

Asp Pro Ile Asp Lys Asn Gln Ile Gln Leu Phe Asn Leu Glu Ser Ser
50 55 60

Lys Ile Glu Val Ile Leu Lys Asn Ala Ile Val Tyr Asn Ser Met Tyr
65 70 75 80

Glu Asn Phe Ser Thr Ser Phe Trp Ile Arg Ile Pro Lys Tyr Phe Asn
85 90 95

Ser Ile Ser Leu Asn Asn Glu Tyr Thr Ile Ile Asn Cys Met Glu Asn
100 105 110

Asn Ser Gly Trp Lys Val Ser Leu Asn Tyr Gly Glu Ile Ile Trp Thr
115 120 125

Leu Gln Asp Thr Gln Glu Ile Lys Gln Arg Val Val Phe Lys Tyr Ser
130 135 140

Gln Met Ile Asn Ile Ser Asp Tyr Ile Asn Arg Trp Ile Phe Val Thr
145 150 155 160

Ile Thr Asn Asn Arg Leu Asn Asn Ser Lys Ile Tyr Ile Asn Gly Arg
165 170 175

Leu Ile Asp Gln Lys Pro Ile Ser Asn Leu Gly Asn Ile His Ala Ser
180 185 190

Asn Asn Ile Met Phe Lys Leu Asp Gly Cys Arg Asp Thr His Arg Tyr
195 200 205

Ile Trp Ile Lys Tyr Phe Asn Leu Phe Asp Lys Glu Leu Asn Glu Lys
210 215 220

Glu Ile Lys Asp Leu Tyr Asp Asn Gln Ser Asn Ser Gly Ile Leu Lys
225 230 235 240

Asp Phe Trp Gly Asp Tyr Leu Gln Tyr Asp Lys Pro Tyr Tyr Met Leu
245 250 255

Asn Leu Tyr Asp Pro Asn Lys Tyr Val Asp Val Asn Asn Val Gly Ile
260 265 270

Arg Gly Tyr Met Tyr Leu Lys Gly Pro Arg Gly Ser Val Met Thr Thr
275 280 285

Asn Ile Tyr Leu Asn Ser Ser Leu Tyr Arg Gly Thr Lys Phe Ile Ile
290 295 300

Lys Lys Tyr Ala Ser Gly Asn Lys Asp Asn Ile Val Arg Asn Asn Asp
305 310 315 320

Arg Val Tyr Ile Asn Val Val Val Lys Asn Lys Glu Tyr Arg Leu Ala
325 330 335

Thr Asn Ala Ser Gln Ala Gly Val Glu Lys Ile Leu Ser Ala Leu Glu
340 345 350

Ile Pro Asp Val Gly Asn Leu Ser Gln Val Val Val Met Lys Ser Lys
355 360 365

Asn Asp Gln Gly Ile Thr Asn Lys Cys Lys Met Asn Leu Gln Asp Asn
370 375 380

Asn Gly Asn Asp Ile Gly Phe Ile Gly Phe His Gln Phe Asn Asn Ile
385 390 395 400

Ala Lys Leu Val Ala Ser Asn Trp Tyr Asn Arg Gln Ile Glu Arg Ser
405 410 415

Ser Arg Thr Leu Gly Cys Ser Trp Glu Phe Ile Pro Val Asp Asp Gly
420 425 430

Trp Gly Glu Arg Pro Leu
435



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Met Gly His His His His His His His His His His Ser Ser Gly His
1 5 10 15

Ile Glu Gly Arg His Met Ala
20
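
The stated length of SEQ ID NO:24 can be checked by converting the three-letter residue codes to one-letter codes and counting them. The sketch below does this in Python, with position 17 read as Ile (isoleucine); the lookup table covers only the residues that occur in this sequence, and the names `THREE_TO_ONE` and `to_one_letter` are hypothetical.

```python
# Three-letter to one-letter codes for the residues that occur in SEQ ID NO:24.
THREE_TO_ONE = {
    "Met": "M", "Gly": "G", "His": "H", "Ser": "S",
    "Ile": "I", "Glu": "E", "Arg": "R", "Ala": "A",
}

# SEQ ID NO:24 as listed above (16 residues on the first row, 7 on the second).
SEQ_ID_NO_24 = (
    "Met Gly His His His His His His His His His His Ser Ser Gly His "
    "Ile Glu Gly Arg His Met Ala"
)


def to_one_letter(three_letter_sequence):
    """Convert a space-separated three-letter sequence to a one-letter string."""
    return "".join(THREE_TO_ONE[r] for r in three_letter_sequence.split())


one_letter = to_one_letter(SEQ_ID_NO_24)
print(one_letter, len(one_letter))  # prints: MGHHHHHHHHHHSSGHIEGRHMA 23
```

The count of 23 matches the LENGTH entry for SEQ ID NO:24.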



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1402 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1386 
Pro Arg Gly Ser Val Met Thr Thr Asn Ile Tyr Leu Asn Ser Ser Leu
305 310 315 320

TAC CGT GGT ACC AAA TTC ATC ATC AAG AAA TAC GCG TCT GGT AAC AAG 1008
Tyr Arg Gly Thr Lys Phe Ile Ile Lys Lys Tyr Ala Ser Gly Asn Lys
325 330 335

GAC AAT ATC GTT CGC AAC AAT GAT CGT GTA TAC ATC AAT GTT GTA GTT 1056
Asp Asn Ile Val Arg Asn Asn Asp Arg Val Tyr Ile Asn Val Val Val
340 345 350

AAG AAC AAA GAA TAC CGT CTG GCT ACC AAT GCT TCT CAG GCT GGT GTA 1104
Lys Asn Lys Glu Tyr Arg Leu Ala Thr Asn Ala Ser Gln Ala Gly Val
355 360 365

GAA AAG ATC TTG TCT GCT CTG GAA ATC CCG GAC GTT GGT AAT CTG TCT 1152
Glu Lys Ile Leu Ser Ala Leu Glu Ile Pro Asp Val Gly Asn Leu Ser
370 375 380

CAG GTA GTT GTA ATG AAA TCC AAG AAC GAC CAG GGT ATC ACT AAC AAA 1200
Gln Val Val Val Met Lys Ser Lys Asn Asp Gln Gly Ile Thr Asn Lys
385 390 395 400

TGC AAA ATG AAT CTG CAG GAC AAC AAT GGT AAC GAT ATC GGT TTC ATC 1248
Cys Lys Met Asn Leu Gln Asp Asn Asn Gly Asn Asp Ile Gly Phe Ile
405 410 415

GGT TTC CAC CAG TTC AAC AAT ATC GCT AAA CTG GTT GCT TCC AAC TGG 1296
Gly Phe His Gln Phe Asn Asn Ile Ala Lys Leu Val Ala Ser Asn Trp
420 425 430

TAC AAT CGT CAG ATC GAA CGT TCC TCT CGC ACT CTG GGT TGC TCT TGG 1344
Tyr Asn Arg Gln Ile Glu Arg Ser Ser Arg Thr Leu Gly Cys Ser Trp
435 440 445

GAG TTC ATC CCG GTT GAT GAC GGT TGG GGT GAA CGT CCG CTG 1386
Glu Phe Ile Pro Val Asp Asp Gly Trp Gly Glu Arg Pro Leu
450 455 460

TAACCCGGGA AAGCTT 1402



(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 462 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Met Gly His His His His His His His His His His Ser Ser Gly His
1 5 10 15

Ile Glu Gly Arg His Met Ala Ser Met Ala Arg Leu Leu Ser Thr Phe
20 25 30

Thr Glu Tyr Ile Lys Asn Ile Ile Asn Thr Ser Ile Leu Asn Leu Arg
35 40 45

Tyr Glu Ser Asn His Leu Ile Asp Leu Ser Arg Tyr Ala Ser Lys Ile
50 55 60

Asn Ile Gly Ser Lys Val Asn Phe Asp Pro Ile Asp Lys Asn Gln Ile
65 70 75 80

Gln Leu Phe Asn Leu Glu Ser Ser Lys Ile Glu Val Ile Leu Lys Asn
85 90 95

Ala Ile Val Tyr Asn Ser Met Tyr Glu Asn Phe Ser Thr Ser Phe Trp
100 105 110

Ile Arg Ile Pro Lys Tyr Phe Asn Ser Ile Ser Leu Asn Asn Glu Tyr
115 120 125

Thr Ile Ile Asn Cys Met Glu Asn Asn Ser Gly Trp Lys Val Ser Leu
130 135 140
Asn Tyr Gly Glu Ile Ile Trp Thr Leu Gln Asp Thr Gln Glu Ile Lys
145 150 155 160

Gln Arg Val Val Phe Lys Tyr Ser Gln Met Ile Asn Ile Ser Asp Tyr
165 170 175

Ile Asn Arg Trp Ile Phe Val Thr Ile Thr Asn Asn Arg Leu Asn Asn
180 185 190

Ser Lys Ile Tyr Ile Asn Gly Arg Leu Ile Asp Gln Lys Pro Ile Ser
195 200 205

Asn Leu Gly Asn Ile His Ala Ser Asn Asn Ile Met Phe Lys Leu Asp
210 215 220

Gly Cys Arg Asp Thr His Arg Tyr Ile Trp Ile Lys Tyr Phe Asn Leu
225 230 235 240

Phe Asp Lys Glu Leu Asn Glu Lys Glu Ile Lys Asp Leu Tyr Asp Asn
245 250 255

Gln Ser Asn Ser Gly Ile Leu Lys Asp Phe Trp Gly Asp Tyr Leu Gln
260 265 270

Tyr Asp Lys Pro Tyr Tyr Met Leu Asn Leu Tyr Asp Pro Asn Lys Tyr
275 280 285

Val Asp Val Asn Asn Val Gly Ile Arg Gly Tyr Met Tyr Leu Lys Gly
290 295 300

Pro Arg Gly Ser Val Met Thr Thr Asn Ile Tyr Leu Asn Ser Ser Leu
305 310 315 320

Tyr Arg Gly Thr Lys Phe Ile Ile Lys Lys Tyr Ala Ser Gly Asn Lys
325 330 335

Asp Asn Ile Val Arg Asn Asn Asp Arg Val Tyr Ile Asn Val Val Val
340 345 350

Lys Asn Lys Glu Tyr Arg Leu Ala Thr Asn Ala Ser Gln Ala Gly Val
355 360 365

Glu Lys Ile Leu Ser Ala Leu Glu Ile Pro Asp Val Gly Asn Leu Ser
370 375 380

Gln Val Val Val Met Lys Ser Lys Asn Asp Gln Gly Ile Thr Asn Lys
385 390 395 400

Cys Lys Met Asn Leu Gln Asp Asn Asn Gly Asn Asp Ile Gly Phe Ile
405 410 415

Gly Phe His Gln Phe Asn Asn Ile Ala Lys Leu Val Ala Ser Asn Trp
420 425 430

Tyr Asn Arg Gln Ile Glu Arg Ser Ser Arg Thr Leu Gly Cys Ser Trp
435 440 445

Glu Phe Ile Pro Val Asp Asp Gly Trp Gly Glu Arg Pro Leu
450 455 460
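
The lengths stated in the listing can be cross-checked against one another. SEQ ID NO:26 opens with the 23 residues of SEQ ID NO:24, continues with a single Ser, and then runs through the residues of SEQ ID NO:23, while the CDS feature of SEQ ID NO:25 spans 1..1386 of 1402 base pairs. A minimal Python sketch of that arithmetic follows; the constant names are chosen here for illustration, and the single-Ser junction is read off the listing rather than stated by the patent.

```python
# Lengths as stated in the sequence listing.
LEADER_LEN = 23      # SEQ ID NO:24, amino acids
LINKER_LEN = 1       # single Ser between the leader and SEQ ID NO:23 (read from the listing)
FRAGMENT_LEN = 438   # SEQ ID NO:23, amino acids
FUSION_LEN = 462     # SEQ ID NO:26, amino acids
CDS_END = 1386       # SEQ ID NO:25, CDS location 1..1386
TOTAL_BP = 1402      # SEQ ID NO:25, total length in base pairs

fusion_from_parts = LEADER_LEN + LINKER_LEN + FRAGMENT_LEN
codons_in_cds = CDS_END // 3
trailing_bp = TOTAL_BP - CDS_END

# Each check prints True when the listed numbers are mutually consistent.
print(fusion_from_parts == FUSION_LEN)
print(CDS_END % 3 == 0 and codons_in_cds == FUSION_LEN)
print(trailing_bp == len("TAACCCGGGA" + "AAGCTT"))
```

The same relation holds for the full-length toxin: the CDS of SEQ ID NO:27 spans 1..3888, which is 1296 codons, matching the 1296 amino acids of SEQ ID NO:28.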



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3891 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..3888

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

ATG CAA TTT GTT AAT AAA CAA TTT AAT TAT AAA GAT CCT GTA AAT GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15
TAA 3891



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1296 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Val Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
450 455 460

Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu
465 470 475 480

Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu
485 490 495

Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro
500 505 510

Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520 525

Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
705 710 715 720

Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
Gly Val Glu Lys Ile Leu Ser Ala Leu Glu Ile Pro Asp Val Gly Asn
1205 1210 1215

Leu Ser Gln Val Val Val Met Lys Ser Lys Asn Asp Gln Gly Ile Thr
1220 1225 1230

Asn Lys Cys Lys Met Asn Leu Gln Asp Asn Asn Gly Asn Asp Ile Gly
1235 1240 1245

Phe Ile Gly Phe His Gln Phe Asn Asn Ile Ala Lys Leu Val Ala Ser
1250 1255 1260

Asn Trp Tyr Asn Arg Gln Ile Glu Arg Ser Ser Arg Thr Leu Gly Cys
1265 1270 1275 1280

Ser Trp Glu Phe Ile Pro Val Asp Asp Gly Trp Gly Glu Arg Pro Leu
1285 1290 1295



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: Not Relevant 

(D) TOPOLOGY: Not Relevant 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 12

(D) OTHER INFORMATION: /note= "The asparagine residue at
this position contains an amide group."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 

Cys Gln Thr Ile Asp Gly Lys Lys Tyr Tyr Phe Asn
1 5 10



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: Not Relevant 

(D) TOPOLOGY: Not Relevant 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

His His His His His
1 5 



I claim: 

1. A soluble fusion protein comprising a non-toxin protein 
sequence and a portion of the Clostridium botulinum type A 
toxin, said portion of the Clostridium botulinum type A toxin 
comprising a portion of the sequence of SEQ ID NO:28. 

2. The fusion protein of claim 1, wherein said portion of 
the Clostridium botulinum type A toxin sequence comprises 
SEQ ID NO:23. 

3. The fusion protein of claim 1, wherein said non-toxin 
protein sequence comprises a poly-histidine tract. 

4. The fusion protein of claim 3, which comprises SEQ ID 
NO:26. 

5. The fusion protein of claim 1, wherein said fusion 
protein is substantially endotoxin-free. 

6. A host cell containing a recombinant expression vector,
said vector encoding a protein comprising at least a portion
of a Clostridium botulinum type A toxin protein sequence of
SEQ ID NO:28, and wherein said host cell is capable of
expressing said protein as a soluble protein in said host cell
at a level greater than or equal to 0.75% of the total cellular
protein.

7. The host cell of claim 6, wherein said portion of a toxin 
comprises SEQ ID NO:23. 

8. The host cell of claim 6, wherein said fusion protein 
comprises SEQ ID NO:26. 

9. The host cell of claim 6, wherein said host cell is 
capable of expressing said protein in said host cell at a level 
greater than or equal to 20% of the total cellular protein. 

10. A soluble fusion protein, comprising at least a portion
of Clostridium botulinum C fragment linked to a
poly-histidine tag.

***** 




